A Multi-Environment Cost Evaluator for Parallel Database Systems
Abstract
In this paper, we investigate issues involved in designing and using a cost model for query optimization in parallel database environments. The large range of possible multiprocessor computers, the different requirements of the data-intensive applications to be supported, and the large number of parallel algorithms and information access methods make a multi-environment approach necessary for the design of a cost model. Our proposal follows this approach and introduces a separation between the optimizer and the cost model. A separate tool called the Adaptative Cost Evaluator (ACE) implements the cost model. The optimizer sees the cost evaluator as an intelligent black box, and the evaluator can also be used independently to investigate other problems such as target architectures and data clustering. This design leads to the parameterization of the evaluator with libraries describing the environment (architecture, database profile, system, operators and access methods). In addition, we report on the implementation status of the cost evaluator and give some query evaluation results.
Introduction
New applications such as office automation, stock trading databases, CAD and expert systems strongly influence the next generation of information systems. These applications require database management systems (DBMSs) that are increasingly efficient in terms of user response time. This goal can be achieved by improving the query compiler, executing queries in parallel and using efficient parallel relational algorithms [BIT83a, VAL84]. The query optimizer is fundamental to obtaining high performance. One way to increase its optimization capability is to improve its ability to evaluate query execution time accurately.
The role of the query optimizer is to derive an efficient execution plan to get the information requested by the user [VAL90]. This plan specifies all the information (e.g. access methods, operation order) needed to compute a query. The optimizer should be able to optimize simple queries as well as very complex ones. Some applications such as logic programming [KBZ86] introduce expressions with several hundred joins. To choose an execution plan, the query optimizer needs to evaluate a number of possibilities with the help of cost evaluation programs. The cost evaluation is based upon a cost model which enables the estimation of query costs while taking into account specific details of a target computer and target database.
In this paper, we describe a cost evaluator tool appropriate to the design and prototyping of a complete DBMS as part of the EDS (European Declarative System) ESPRIT II project [AND90a]. This DBMS is to run on a parallel machine with distributed memory [LOP89]. The major goal of the project is the production of a high-performance, highly parallel database server. The query compilation fully exploits the parallelism available in the EDS architecture. In addition to supporting query optimization, the cost evaluator is used to solve other problems such as data placement and performance prediction.
Work on cost modeling in query optimization started with the System R evaluation strategy [SEL79] and with the design of database machine prototypes [SHA79, DEW79] where special-purpose hardware was used to accelerate relational operators. In a number of related works (e.g. [SWA89, WHA87]), the cost model is embedded into the optimizer. In such cases, the cost models are designed for a specific (centralized) environment and cannot easily be generalized. One pioneering effort in flexibility is the EXODUS optimizer generator [GRA87]. Nevertheless, the cost model is still linked to the optimizer and this approach falls short in several ways: first, the costs of complex functions (e.g. when non-linear costs are involved or when data depend on each other) are difficult to express; secondly, when several methods are used for the same query, it is difficult to separate their costs.
To overcome these problems, we propose a multi-environment approach for the design of the cost model. The associated tool (called the Adaptative Cost Evaluator) is implemented as a distinct tool, separated from the optimizer. The knowledge about the environment is decomposed into specific libraries (architecture, database profile, system, operator and access method). This knowledge is expressed by ...
Database Systems for Advanced Applications '91, Ed. A. Makinouchi, © World Scientific Publishing Co.
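The separation between optimizer and cost evaluator described above can be illustrated with a minimal Python sketch. It is not the ACE implementation: every name in it (`Architecture`, `DatabaseProfile`, `CostEvaluator`, `scan_cost`, `nested_loop_join_cost`) is hypothetical, only three of the five libraries are modeled, and the cost formulas, including the assumption of perfect linear speedup across processors, are placeholders chosen for readability rather than taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Architecture:
    """Hypothetical architecture library entry."""
    processors: int
    page_io_ms: float    # assumed time to read one page
    tuple_cpu_ms: float  # assumed CPU time to compare one pair of tuples


@dataclass(frozen=True)
class DatabaseProfile:
    """Hypothetical database profile: cardinalities and page capacity."""
    tuples: Dict[str, int]
    tuples_per_page: int


# An operator library maps an operator name to its cost function.
CostFn = Callable[[Architecture, DatabaseProfile, Tuple[str, ...]], float]


def scan_cost(arch: Architecture, db: DatabaseProfile, rels: Tuple[str, ...]) -> float:
    (rel,) = rels
    pages = -(-db.tuples[rel] // db.tuples_per_page)  # ceiling division
    # Placeholder assumption: I/O is spread evenly over all processors.
    return pages * arch.page_io_ms / arch.processors


def nested_loop_join_cost(arch: Architecture, db: DatabaseProfile, rels: Tuple[str, ...]) -> float:
    outer, inner = rels
    comparisons = db.tuples[outer] * db.tuples[inner]
    # Placeholder assumption: comparisons are spread evenly over all processors.
    return comparisons * arch.tuple_cpu_ms / arch.processors


Plan = List[Tuple[str, Tuple[str, ...]]]


@dataclass
class CostEvaluator:
    """Black box seen by the optimizer: only estimate() is exposed."""
    arch: Architecture
    db: DatabaseProfile
    operators: Dict[str, CostFn]

    def estimate(self, plan: Plan) -> float:
        """Sum the estimated cost (ms) of each (operator, relations) step."""
        return sum(self.operators[op](self.arch, self.db, rels) for op, rels in plan)


# The same plan evaluated against two environments by swapping one library.
operators = {"scan": scan_cost, "nl_join": nested_loop_join_cost}
db = DatabaseProfile(tuples={"R": 100, "S": 20}, tuples_per_page=10)
plan: Plan = [("scan", ("R",)), ("scan", ("S",)), ("nl_join", ("R", "S"))]

sequential = CostEvaluator(Architecture(1, 10.0, 0.5), db, operators)
parallel = CostEvaluator(Architecture(4, 10.0, 0.5), db, operators)
print(sequential.estimate(plan), parallel.estimate(plan))
```

In this sketch the optimizer would only ever call `estimate`, so retargeting to another machine or database means replacing a library object rather than changing optimizer code, which is the property the multi-environment approach is after.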
Similar resources
A Cost Evaluator for Parallel Database Systems
The design of an ESQL query optimizer may be decomposed into three dimensions: (i) the search space which defines the syntactic representation of all relevant aspects of an execution, (ii) the search strategy used to generate an optimal execution plan and (iii) the cost evaluator which calculates the metrics used by the search strategies. In this paper, we investigate issues involved in designin...
A Multi Objective Optimization Model for Redundancy Allocation Problems in Series-Parallel Systems with Repairable Components
The main goal in this paper is to propose an optimization model for determining the structure of a series-parallel system. Regarding the previous studies in series-parallel systems, the main contribution of this study is to expand the redundancy allocation parallel to systems that have repairable components. The considered optimization model has two objectives: maximizing the system mean time t...
Comparing Parallel Simulated Annealing, Parallel Vibrating Damp Optimization and Genetic Algorithm for Joint Redundancy-Availability Problems in a Series-Parallel System with Multi-State Components
In this paper, we study different methods of solving joint redundancy-availability optimization for series-parallel systems with multi-state components. We analyzed various effective factors on system availability in order to determine the optimum number and version of components in each sub-system and consider the effects of improving failure rates of each component in each sub-system and impr...
Cache Modeling in a Performance Evaluator for Parallel Database Systems
Cache modelling is an important issue in developing an analytical performance evaluator to estimate performance for applications running on parallel DBMSs. This paper describes a cache model developed for parallel cache management in Oracle7 Parallel Server. Some preliminary results have also been obtained by using the cache model to predict cache hit ratios for varying database size and varyin...
Developing a Multi-objective Mathematical Model for Dynamic Cellular Manufacturing Systems
This paper addresses the design of cellular manufacturing systems (CMSs) in a dynamic and flexible environment. CM suits small-to-medium lot production environments and helps companies produce a variety of products with minimal scrap. The most important benefits of CM are a decline in material handling, a reduction in work-in-process, a reduction in set-up time, increme...